AITopics

Country:

Europe > France (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(5 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (0.92)

Industry:

Leisure & Entertainment > Games (1.00)
Health & Medicine > Therapeutic Area (0.67)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(5 more...)

Ngouonou, Jovial Cheukam, Gindullin, Ramiz, Quimper, Claude-Guy, Beldiceanu, Nicolas, Douence, Remi

Incremental Selection of Most-Filtering Conjectures and Proofs of the Selected Conjectures

arXiv.org Artificial Intelligence Nov-4-2025

We present an improved, incremental version of the selection algorithm presented in [1] and prove all the selected conjectures.

artificial intelligence, conjecture, constraint, (13 more...)

2511.00194

Country:

North America (0.28)
Europe > France (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Neural Information Processing Systems Oct-10-2025, 19:10:44 GMT

Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?

Our findings on the NoRa dataset reveal a prevalent vulnerability to such noise among current LLMs, with existing robust methods like self-correction and self-consistency showing limited efficacy.

dataset, digit, rationale, (16 more...)

Country:

Europe > France (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(5 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (0.92)

Industry:

Leisure & Entertainment > Games (1.00)
Health & Medicine > Therapeutic Area (0.67)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(3 more...)

Neural Information Processing Systems Oct-2-2025, 04:42:14 GMT

Dying Experts: Efficient Algorithms with Optimal Regret Bounds

Hamid Shayestehmanesh (Department of Computer Science, University of Victoria), Sajjad Azami

Various results suggest that achieving optimal regret in the fully adversarial sleeping experts problem is computationally hard.

algorithm, hedge, learner, (15 more...)

Country:

North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence Sep-18-2025

Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation

Zhang, Enci, Yan, Xingang, Lin, Wei, Zhang, Tianxiang, Lu, Qianchun

Despite impressive progress in areas like mathematical reasoning, large language models still face significant challenges in consistently solving complex problems. Drawing inspiration from key human learning strategies, we propose two novel strategies to enhance the capability of large language models to solve these complex problems. First, Adaptive Difficulty Curriculum Learning (ADCL) is a novel curriculum learning strategy that tackles the Difficulty Shift phenomenon (i.e., a model's perception of problem difficulty dynamically changes during training) by periodically re-estimating difficulty within upcoming data batches to maintain alignment with the model's evolving capabilities. Second, Expert-Guided Self-Reformulation (EGSR) is a novel reinforcement learning strategy that bridges the gap between imitation learning and pure exploration by guiding models to reformulate expert solutions within their own conceptual framework, rather than relying on direct imitation, fostering deeper understanding and knowledge assimilation. Extensive experiments on challenging mathematical reasoning benchmarks, using Qwen2.5-7B as the base model, demonstrate that these human-inspired strategies synergistically and significantly enhance performance. Notably, their combined application improves performance over the standard Zero-RL baseline by 10% on the AIME24 benchmark and 16.6% on AIME25.
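The periodic re-estimation idea behind ADCL can be illustrated with a minimal sketch. It assumes a hypothetical `estimate_difficulty` callable that scores a problem with the current model and a hypothetical `train_on` callable that updates the model; neither name comes from the paper, and the paper's actual batching and scoring details may differ.

```python
from typing import Callable, List, Sequence


def adaptive_curriculum(
    problems: Sequence[str],
    estimate_difficulty: Callable[[str], float],  # hypothetical: scores a problem with the current model
    train_on: Callable[[List[str]], None],        # hypothetical: one training phase on a batch
    num_batches: int,
) -> List[List[str]]:
    """Train easy-to-hard, re-scoring each upcoming batch before it is used."""
    if num_batches <= 0:
        raise ValueError("num_batches must be positive")

    # Initial easy-to-hard ordering from the untrained model's difficulty estimates.
    ordered = sorted(problems, key=estimate_difficulty)
    size = max(1, -(-len(ordered) // num_batches))  # ceiling division
    batches = [ordered[i:i + size] for i in range(0, len(ordered), size)]

    schedule: List[List[str]] = []
    for batch in batches:
        # The model has changed since the initial ordering, so re-estimate
        # difficulty for the upcoming batch only and re-sort it.
        batch = sorted(batch, key=estimate_difficulty)
        train_on(batch)
        schedule.append(batch)
    return schedule
```

Re-scoring only the upcoming batch, rather than the whole dataset, keeps the extra evaluation cost proportional to one batch per training phase.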

large language model, machine learning, reinforcement learning, (16 more...)

2505.08364

Country:

Asia (0.68)
North America > Mexico (0.28)

Genre: Research Report (0.64)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Neural Information Processing Systems Aug-19-2025, 17:59:10 GMT

Predicting Label Distribution from Multi-label Ranking

It is obvious that Eq. (5) holds for k = 2. Information on the datasets we used is shown in Table 1. The first four rows in Table 1 are existing label distribution datasets; the last three rows are datasets we created. Since some examples in the original label distribution datasets do not satisfy the prerequisites of our paper (i.e., there are some examples

artificial intelligence, machine learning, relevant label, (15 more...)

Country: Asia > China > Jiangsu Province > Nanjing (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

Neural Information Processing Systems Aug-14-2025, 16:33:16 GMT

Diversity Enhanced Active Learning with Strictly Proper Scoring Rules: Appendix

This appendix contains further details regarding the theory and experiments.

backbone classifier, dataset, learning curve, (10 more...)

Country: Asia > Middle East > Jordan (0.04)

Industry: Leisure & Entertainment > Games (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Salah, Ahmed, Yevick, David

Tracing the Path to Grokking: Embeddings, Dropout, and Network Activation

arXiv.org Artificial Intelligence Jul-17-2025

Grokking refers to delayed generalization, in which the test accuracy of a neural network increases appreciably later than its training accuracy. This paper introduces several practical metrics, including variance under dropout, robustness, embedding similarity, and sparsity measures, that can forecast grokking behavior. Specifically, the resilience of neural networks to noise during inference is estimated from a Dropout Robustness Curve (DRC) obtained from the variation of the accuracy with the dropout rate as the model transitions from memorization to generalization. The variance of the test accuracy under stochastic dropout across training checkpoints further exhibits a local maximum during grokking. Additionally, the percentage of inactive neurons decreases during generalization, while the embeddings tend to a bimodal distribution independent of initialization that correlates with the observed cosine similarity patterns and dataset symmetries. These metrics additionally provide valuable insight into the origin and behavior of grokking.
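A Dropout Robustness Curve as described above can be sketched as accuracy measured over a sweep of inference-time dropout rates, with the variance across stochastic trials recorded alongside. The `predict` callable and the toy classifier below are hypothetical stand-ins, and dropping input features (rather than hidden activations) is an assumption made only to keep the example self-contained.

```python
import random
from statistics import mean, pvariance
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical model interface: (input, dropout_rate, rng) -> predicted label.
Predict = Callable[[Sequence[float], float, random.Random], int]


def dropout_robustness_curve(
    predict: Predict,
    inputs: Sequence[Sequence[float]],
    labels: Sequence[int],
    rates: Sequence[float],
    trials: int = 20,
    seed: int = 0,
) -> Dict[float, Tuple[float, float]]:
    """Map each dropout rate to (mean accuracy, variance of accuracy) over trials."""
    if not labels or len(inputs) != len(labels):
        raise ValueError("inputs and labels must be non-empty and equally long")
    rng = random.Random(seed)
    curve: Dict[float, Tuple[float, float]] = {}
    for rate in rates:
        accuracies = []
        for _ in range(trials):
            correct = sum(predict(x, rate, rng) == y for x, y in zip(inputs, labels))
            accuracies.append(correct / len(labels))
        curve[rate] = (mean(accuracies), pvariance(accuracies))
    return curve


def toy_predict(x: Sequence[float], rate: float, rng: random.Random) -> int:
    """Toy classifier: drop each feature with probability `rate`, then threshold the sum."""
    kept = [v for v in x if rng.random() >= rate]
    return int(sum(kept) > 0)
```

Evaluating the same curve at successive training checkpoints would give the per-checkpoint variance whose local maximum the abstract associates with grokking.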

accuracy, artificial intelligence, machine learning, (17 more...)

2507.11645

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Avhad, Priyanka, Kshirsagar, Vedanti, Ranjan, Urvi, Nakhua, Mahek

Devising a solution to the problems of Cancer awareness in Telangana

arXiv.org Artificial Intelligence Jun-27-2025

According to the data, the percentages of women in Telangana who underwent screening for cervical, breast, and oral cancer in 2020 were 3.3 percent, 0.3 percent, and 2.3 percent, respectively. Although early detection is the only way to reduce morbidity and mortality, awareness of the signs, symptoms, and screening practices for cervical and breast cancer is very low. We developed an ML classification model to predict whether a person is susceptible to breast or cervical cancer based on demographic factors. We devised a system that suggests the nearest hospital or cancer treatment centre based on the user's location or address. In addition, a health card can be integrated to maintain medical records of all individuals and to conduct awareness drives and campaigns. For the ML classification models, we used decision tree classification for cervical cancer susceptibility and support vector classification for breast cancer susceptibility. This solution brings us one step closer to our goal of spreading cancer awareness, thereby decreasing cancer mortality and increasing cancer literacy among the people of Telangana.

artificial intelligence, cancer, machine learning, (15 more...)

2506.215

Country:

Asia > India > Telangana (0.82)
Asia > India > Maharashtra > Mumbai (0.05)
South America > Venezuela > Capital District > Caracas (0.04)

Genre: Research Report > Experimental Study (0.66)

Industry: Health & Medicine > Therapeutic Area > Oncology > Cervical Cancer (0.99)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.35)

arXiv.org Artificial Intelligence May-28-2025

Accelerating RL for LLM Reasoning with Optimal Advantage Regression

Brantley, Kianté, Chen, Mingyu, Gao, Zhaolin, Lee, Jason D., Sun, Wen, Zhan, Wenhao, Zhang, Xuezhou

Reinforcement learning (RL) has emerged as a powerful tool for fine-tuning large language models (LLMs) to improve complex reasoning abilities. However, state-of-the-art policy optimization methods often suffer from high computational overhead and memory consumption, primarily due to the need for multiple generations per prompt and the reliance on critic networks or advantage estimates of the current policy. In this paper, we propose $A^*$-PO, a novel two-stage policy optimization framework that directly approximates the optimal advantage function and enables efficient training of LLMs for reasoning tasks. In the first stage, we leverage offline sampling from a reference policy to estimate the optimal value function $V^*$, eliminating the need for costly online value estimation. In the second stage, we perform on-policy updates using a simple least-squares regression loss with only a single generation per prompt. Theoretically, we establish performance guarantees and prove that the KL-regularized RL objective can be optimized without requiring complex exploration strategies. Empirically, $A^*$-PO achieves competitive performance across a wide range of mathematical reasoning benchmarks, while reducing training time by up to 2$\times$ and peak memory usage by over 30% compared to PPO, GRPO, and REBEL. Implementation of $A^*$-PO can be found at https://github.com/ZhaolinGao/A-PO.
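The two stages can be illustrated with a small numerical sketch. It assumes the standard closed form for a KL-regularized objective with temperature `beta`, where the optimal value of a prompt is `beta * log E[exp(r / beta)]` under the reference policy and the optimal policy satisfies `beta * log(pi / pi_ref) = r - V*`. The function names are hypothetical, and the abstract does not confirm that the released implementation uses exactly this estimator or loss.

```python
import math
from typing import Sequence


def estimate_v_star(rewards: Sequence[float], beta: float) -> float:
    """Stage 1 (assumed form): estimate V*(x) from rewards of offline
    reference-policy samples as beta * log(mean(exp(r / beta)))."""
    if not rewards or beta <= 0:
        raise ValueError("need at least one reward and beta > 0")
    scaled = [r / beta for r in rewards]
    m = max(scaled)  # log-sum-exp shift for numerical stability
    return beta * (m + math.log(sum(math.exp(s - m) for s in scaled) / len(scaled)))


def advantage_regression_loss(
    logp_policy: float,  # log pi(y | x) for the single sampled response
    logp_ref: float,     # log pi_ref(y | x) for the same response
    reward: float,
    v_star: float,
    beta: float,
) -> float:
    """Stage 2 (assumed form): squared error between the scaled log-ratio
    and the estimated optimal advantage r(x, y) - V*(x)."""
    residual = beta * (logp_policy - logp_ref) - (reward - v_star)
    return residual ** 2
```

Under these assumptions the loss is zero exactly when the policy's log-ratio to the reference matches the estimated optimal advantage, which is why a single generation per prompt suffices for each update in this sketch.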

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2505.20686

Genre: Research Report > New Finding (0.67)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)